We announce the draft genome sequence of Mycobacterium austroafricanum DSM 44191T (= E9789-SA12441T),
a non-tuberculosis species responsible for opportunistic infection. The genome described here has a
size of 6,772,357 bp with a G+C content of 66.79% and contains 6,419 protein-coding genes and 112 RNA genes.

Mycobacterium austroafricanum was named after the initial
isolation of a set of 23 strains from water in South Africa (1).
Numerical taxonomy indicated that these isolates are representative
of a new species related to Mycobacterium parafortuitum (1).
Further genetic analyses indicated that, in fact, M. austroafricanum
belongs to the Mycobacterium vaccae complex, which also
contains Mycobacterium vanbaalenii, and is more distantly related
to Mycobacterium aurum and Mycobacterium pyrenivorans (2).
M. austroafricanum is an environmental organism isolated from
soil, in particular, from hydrocarbon-polluted soils (3, 4). Indeed,
M. austroafricanum has attracted much attention because it is
able to degrade gasoline hydrocarbons (5, 6). M. austroafricanum
has rarely been isolated from patients (7, 8), and while
M. austroafricanum DNA has been detected in diseased joint
fluids, the clinical significance of M. austroafricanum has not
yet been established (9).

We therefore sequenced the whole genome of M. austroafricanum
strain DSM 44191T (= E9789-SA12441T) in order to clarify
its phylogenetic relationship with closely related mycobacteria
and to help characterize its unique metabolic capabilities.

Genomic DNA was isolated from M. austroafricanum grown in
Middlebrook 7H9 broth (Becton Dickinson, Sparks, MD) at 37°C.
The DNA was then sequenced using three high-throughput
next-generation sequencing (NGS) technologies: Roche 454 (Roche
Diagnostics Corporation, Indianapolis, IN) (10), SOLiD version 4
(Life Technologies, Carlsbad, CA), and Illumina MiSeq (Illumina Inc.,
San Diego, CA). A 3.74-kb paired-end library was loaded on a
picotiter plate and sequenced with the Roche GS FLX Titanium
Sequencing Kit XLR70. The run yielded 143.9 Mb from 435,968 reads
passing filter, with an average read length of 329 bp. The bar-coded
paired-end SOLiD library generated 975,705 reads of 50 × 35 bp.
Finally, a paired-end Nextera library, fragmented at 800 bp and
sequenced on the MiSeq at 2 × 151 bp, yielded 823,878 reads with an
indexing of 6.57% on the flowcell.
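
The 454 run figures quoted above can be cross-checked with simple arithmetic. The following Python sketch is illustrative only and is not part of the original analysis: it assumes that "Mb" means 10^6 bases and uses the estimated genome size of 6,772,357 bp as the denominator for a nominal fold coverage. Under these assumptions the mean read length comes out close to the reported 329 bp.

```python
# Figures quoted in the text for the Roche 454 run.
TOTAL_BASES_454 = 143.9e6   # 143.9 Mb yield (assumed to mean 10^6 bases per Mb)
READS_454 = 435_968         # reads passing filter
GENOME_SIZE_BP = 6_772_357  # estimated genome size including gaps


def mean_read_length(total_bases, n_reads):
    """Average read length in bp, given total yield and read count."""
    return total_bases / n_reads


def fold_coverage(total_bases, genome_size):
    """Nominal sequencing depth: sequenced bases per genome base."""
    return total_bases / genome_size


print(f"mean 454 read length: {mean_read_length(TOTAL_BASES_454, READS_454):.1f} bp")
print(f"nominal 454 coverage: {fold_coverage(TOTAL_BASES_454, GENOME_SIZE_BP):.1f}x")
```

The coverage value is a rough estimate for the 454 data alone; it ignores trimming and reads that do not map to the final assembly.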

Reads from these various sequencing technologies were first
assembled separately. The 454 reads were assembled into
contigs and scaffolds using Newbler version 2.8 (Roche). Illumina
reads were trimmed using Trimmomatic (11) and then assembled
using SPAdes (12, 13). The resulting contigs were combined using
SSPACE (14) and Opera v1.2 (15), complemented by GapFiller v1.10 (16).
The assembly was then improved using CLC Genomics v5 software
(CLC bio, Aarhus, Denmark).

The M. austroafricanum draft genome sequence consists of 20
scaffolds comprising 69 contigs and containing 6,682,536 bp, with an
estimated genome size including gaps of 6,772,357 bp. The G+C content
of this genome is 66.79%. Noncoding genes and miscellaneous features
were predicted using RNAmmer (17), ARAGORN (18),
Rfam (19), and PFAM (20). Open reading frames were predicted
using Prodigal (21), and functional annotation was achieved using
BLASTp against the GenBank database (22) and the Clusters of
Orthologous Groups (COG) database (23, 24). The genome was
shown to encode at least 112 predicted RNAs, including 4 rRNAs,
85 tRNAs, 2 transfer-messenger RNAs, and 21 miscellaneous
RNAs. In addition, 6,419 protein-coding genes were identified,
yielding a coding capacity of 6,124,875 bp (coding percentage, 90.4%).
Among these genes, 850 (12.77%) were found to encode putative proteins
and 1,324 (16.88%) were assigned as encoding hypothetical proteins.
Moreover, 6,549 genes matched at least one sequence in the
COG database with BLASTp default parameters.
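
As an illustration of how the summary statistics above are defined, the following Python sketch computes a G+C percentage from a nucleotide string and a coding percentage from base-pair counts. It is a minimal sketch, not the pipeline used for this genome: the function names are hypothetical, and excluding ambiguous bases such as N from the G+C denominator is an assumption. The quoted coding percentage of 90.4% is reproduced when the coding capacity is divided by the estimated genome size of 6,772,357 bp.

```python
def gc_content(seq):
    """G+C percentage among unambiguous bases (A, C, G, T).

    Ambiguous symbols such as N are ignored (an assumption of this sketch).
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return 100.0 * gc / total if total else 0.0


def coding_percentage(coding_bp, genome_bp):
    """Share of the genome covered by protein-coding sequence, in percent."""
    return 100.0 * coding_bp / genome_bp


# Values quoted in the text.
CODING_BP = 6_124_875
GENOME_BP = 6_772_357

print(f"coding percentage: {coding_percentage(CODING_BP, GENOME_BP):.1f}%")
print(f"G+C of a toy sequence: {gc_content('GGCCAT'):.2f}%")
```

For a real assembly, the sequence passed to `gc_content` would be the concatenation of all scaffold sequences read from the FASTA file.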

Nucleotide sequence accession numbers. The Mycobacterium
austroafricanum strain DSM 44191T (= E9789-SA12441T) genome
sequence has been deposited at DDBJ/EMBL/GenBank under
accession no. HG964450 to HG964469. The whole-genome shotgun
master numbers are CCAW010000001 to CCAW010000069.
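
For readers who wish to retrieve the individual records, the accession ranges above can be expanded into explicit lists. The Python sketch below is an illustration only; the helper name `expand_accession_range` is hypothetical, and it assumes that both ends of a range share the same letter prefix and the same number of digits. The resulting counts agree with the 20 scaffolds and 69 contigs reported for the assembly.

```python
import re


def expand_accession_range(first, last):
    """Expand an accession range such as HG964450 to HG964469 into a list.

    Assumes a letter prefix followed by digits, identical in prefix and
    digit count at both ends of the range.
    """
    m1 = re.fullmatch(r"([A-Z]+)(\d+)", first)
    m2 = re.fullmatch(r"([A-Z]+)(\d+)", last)
    if not m1 or not m2:
        raise ValueError("accessions must be letters followed by digits")
    if m1.group(1) != m2.group(1) or len(m1.group(2)) != len(m2.group(2)):
        raise ValueError("range ends must share prefix and digit count")
    prefix, width = m1.group(1), len(m1.group(2))
    start, end = int(m1.group(2)), int(m2.group(2))
    # Zero-pad so that leading zeros in the numeric part are preserved.
    return [f"{prefix}{n:0{width}d}" for n in range(start, end + 1)]


scaffolds = expand_accession_range("HG964450", "HG964469")
contigs = expand_accession_range("CCAW010000001", "CCAW010000069")
print(len(scaffolds), len(contigs))
```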